Segmented nestedness in binary data

Authors

  • Esa Junttila
  • Petteri Kaski
Abstract

A binary matrix is fully nested if its columns form a chain of subsets; that is, any two columns are ordered by the subset relation, where we view each column as a subset of the rows indicated by the 1-entries. A binary matrix is k-nested if its columns can be partitioned into k pairwise disjoint blocks, each of which is fully nested. Such nested patterns are encountered, for example, in presence/absence patterns of species in ecological data. We study the automated discovery of k-nestedness on synthetic data and real ecological data. First, we show that k-nestedness can be efficiently discovered in a noise-free setting using a polynomial-time algorithm. Second, we show that it is NP-hard to find a k-nested matrix that minimizes the Hamming distance to a given dataset. Thus, it is likely that in the presence of noise no efficient algorithm exists for discovering k-nestedness in the general case. Third, we develop and evaluate multiple heuristic algorithms for discovering k-nestedness on noisy synthetic data. The methods based on a combination of singular value decomposition and k-means++ give the best performance in terms of structure discovery and noise tolerance. Fourth, we develop an MDL-based model selection technique for assessing nestedness, and discover k-nested structure in (a) paleontological data, and (b) geographical occurrence data for mammal species in Europe.
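
To make the definitions above concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes the matrix is given as a list of 0/1 rows, and the function names (`column_sets`, `is_fully_nested`, `min_nested_blocks`) are hypothetical. The first check tests full nestedness directly from the definition. The second finds the smallest k for which a noise-free matrix is k-nested, using a standard technique: a minimum chain partition of the columns under the subset relation, computed through bipartite matching.

```python
from typing import FrozenSet, List

Matrix = List[List[int]]  # rows of 0/1 entries; an assumed input format


def column_sets(matrix: Matrix) -> List[FrozenSet[int]]:
    """View each column as the set of row indices holding a 1-entry."""
    n_cols = len(matrix[0]) if matrix else 0
    return [
        frozenset(r for r, row in enumerate(matrix) if row[c])
        for c in range(n_cols)
    ]


def is_fully_nested(matrix: Matrix) -> bool:
    """True if the columns form a chain under the subset relation."""
    cols = sorted(column_sets(matrix), key=len)
    # In a chain, sorting by size orders the columns by inclusion,
    # so it suffices to compare neighbours.
    return all(a <= b for a, b in zip(cols, cols[1:]))


def min_nested_blocks(matrix: Matrix) -> int:
    """Smallest k such that the columns split into k fully nested blocks.

    Standard reduction (an assumption of this sketch, not taken from the
    paper): a minimum chain partition has size n minus the size of a
    maximum matching in the bipartite "subset of" graph.
    """
    cols = column_sets(matrix)
    n = len(cols)
    # Edge i -> j when column i is contained in column j; identical
    # columns are ordered by index so the relation stays acyclic.
    succ = [
        [
            j
            for j in range(n)
            if i != j and cols[i] <= cols[j] and (cols[i] != cols[j] or i < j)
        ]
        for i in range(n)
    ]
    match_of = [-1] * n  # match_of[j] = predecessor currently matched to j

    def try_augment(i: int, seen: set) -> bool:
        # Kuhn's augmenting-path search from left vertex i.
        for j in succ[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_of[j] == -1 or try_augment(match_of[j], seen):
                match_of[j] = i
                return True
        return False

    matched = sum(1 for i in range(n) if try_augment(i, set()))
    return n - matched


# Columns {0}, {0,1}, {2}, {2,3}: two separate chains of subsets.
two_blocks = [
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
print(is_fully_nested(two_blocks), min_nested_blocks(two_blocks))
```

This sketch covers only the noise-free case. With noisy data, the abstract states that finding the closest k-nested matrix in Hamming distance is NP-hard, which is why heuristics are studied instead.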

Similar resources

The ghost of nestedness in ecological networks.

Ecologists are fascinated by the prevalence of nestedness in biogeographic and community data, where it is thought to promote biodiversity in mutualistic systems. Traditionally, nestedness has been treated in a binary sense: species and their interactions are either present or absent, neglecting information on abundances and interaction frequencies. Extending nestedness to quantitative data fac...

Improving the analyses of nestedness for large sets of matrices

Nestedness is a property of binary matrices of ecological data and is quantified by the matrix’s temperature, T. The program widely used to calculate T is Nestedness Temperature Calculator (NTC). NTC analyses matrices individually, which makes the analysis of large sets time-consuming. We introduce ANINHADO, a program developed to perform rapid and automatic calculation of T over 10,000 matrices. ANIN...

FALCON: a software package for analysis of nestedness in bipartite networks

Nestedness is a statistical measure used to interpret bipartite interaction data in several ecological and evolutionary contexts, e.g. biogeography (species-site relationships) and species interactions (plant-pollinator and host-parasite networks). Multiple methods have been used to evaluate nestedness, which differ in how the metrics for nestedness are determined. Furthermore, several differen...

Patterns in permuted binary matrices

Reorganizing a dataset so that its hidden structure can be observed is useful in any data analysis task. For example, detecting a regularity in a dataset helps us to interpret the data, compress the data, and explain the processes behind the data. We study datasets that come in the form of binary matrices (tables with 0s and 1s). Our goal is to develop automatic methods that bring out certain p...

Null model analysis of species nestedness patterns.

Nestedness is a common biogeographic pattern in which small communities form proper subsets of large communities. However, the detection of nestedness in binary presence-absence matrices will be affected by both the metric used to quantify nestedness and the reference null distribution. In this study, we assessed the statistical performance of eight nestedness metrics and six null model algorit...

Publication date: 2011